Transaction-filtering data mining and a predictive model for intelligent data management
Author
ChenHan Liao, AMAC Group, School of Engineering, Cranfield University
Abstract
This thesis first proposes a new data mining paradigm, transaction-filtering association rule mining, which addresses the time cost of repeatedly scanning the original transaction database in conventional association rule mining algorithms. An in-memory transaction filter is designed to discard infrequent items in the pruning steps. The filter is a data structure that is updated at the end of each iteration. Results on an IBM benchmark show an execution time reduction of 10% to 19% compared with the base case. Next, a data mining-based predictive model is established, contributing to intelligent data management within the context of the Centre for Grid Computing. Their capability to discover unseen rules, patterns and correlations makes data mining techniques favourable in areas where massive amounts of data are generated. The past behaviours of two typical scenarios (network file systems and Data Grids) have been analyzed to build the model. The future popularity of files can be forecast with an accuracy of 90% by deploying this predictor on the given real system traces. A further step towards intelligent policy design is achieved by analyzing the predicted future popularity of files. Real system trace-based simulations have shown a 2-4 times improvement in data response time in the network file system scenario and a 24% reduction in mean job time in Data Grids compared with conventional cases.
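The transaction-filtering idea can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the function name `filtered_apriori`, the use of an absolute count for `min_support`, and the choice of a plain list of frozensets as the in-memory filter are all assumptions made for illustration. The sketch runs an Apriori-style loop, but each pass counts candidates against a pruned working copy that is updated at the end of the iteration, rather than rescanning the original transactions.

```python
from itertools import combinations


def filtered_apriori(transactions, min_support):
    """Apriori-style mining over a shrinking in-memory copy of the data.

    Hypothetical sketch: min_support is an absolute count (assumption).
    Returns a dict mapping each frequent itemset (frozenset) to its count.
    """
    # The in-memory filter: a working copy that is pruned after every pass,
    # so later passes never touch the original transaction database.
    working = [frozenset(t) for t in transactions]
    candidates = {frozenset([item]) for t in working for item in t}
    frequent = {}
    k = 1
    while candidates:
        # Count each candidate against the filtered transactions only.
        counts = {c: 0 for c in candidates}
        for t in working:
            for c in candidates:
                if c <= t:
                    counts[c] += 1
        level = {c: n for c, n in counts.items() if n >= min_support}
        if not level:
            break
        frequent.update(level)

        # Update the filter at the end of the iteration: keep only items that
        # still occur in some frequent k-itemset, then drop transactions that
        # are too short to contain any (k + 1)-itemset.
        alive = frozenset().union(*level)
        working = [t & alive for t in working]
        working = [t for t in working if len(t) > k]

        # Generate (k + 1)-candidates whose k-subsets are all frequent.
        k += 1
        joined = {a | b for a in level for b in level if len(a | b) == k}
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent
```

Pruning is safe here because every item of a frequent (k + 1)-itemset must already appear in a frequent k-itemset, so discarded items cannot contribute to later passes. Whether this yields a speed-up comparable to the figure reported in the abstract depends on the data and is not something this sketch demonstrates.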
Similar resources
A Proposed Model to Identify Factors Affecting Asthma using Data Mining
Introduction: The identification of asthma risk factors plays an important role in preventing asthma as well as in reducing the severity of symptoms. Nowadays, the identification process can be performed using modern techniques. Data mining is one such technique, with many applications in the fields of diagnosis, prediction, and treatment. This study aimed to identify the effecti...
Intelligent Approach for Attracting Churning Customers in Banking Industry Based on Collaborative Filtering
During the last years, increased competition among banks has caused many developments in banking experiences and technology, while leading to even more churning customers due to their desire of having the best services. Therefore, it is an extremely significant issue for the banks to identify churning customers and attract them to the banking system again. In order to tackle this issue, this pa...
Designing an intelligent system for predicting chromosomal genetic diseases using data mining
Background and Aim: Today we are witnessing tremendous advances in medical data mining. The data, by analyzing and discovering the relationships between them, can lead to algorithms that help us prevent or treat many diseases. Meanwhile, genetic diseases have attracted a large part of the attention of the medical world because the birth of children with genetic disorders imposes a great financi...
Presenting an approach for the management and organization of text documents using intelligent text analysis
Given that stored data occupies a large amount of space in organizations' retention and information management systems, which has resulted in gigantic data warehouses, the need for extracting an appropriate model is increasingly felt. Text mining is one of the most significant methods for extracting a useful and appropriate model that helps organizations in achieving their goals throug...
Using Data Mining Techniques for Intelligent Diagnosis of Severity of Depressive Disorder
Introduction: Implementing a method that can help individuals diagnose or prevent mental disorders can be an important step in preventing and controlling these disorders especially in the early stages. The objective of this research was to apply data mining techniques for intelligent diagnosis of severity of depressive disorder. Method: The present applied research was carried out by going to a...